RECOMBINANT OLEOSINS FROM CACAO AND THEIR USE AS FLAVORING OR EMULSIFYING AGENTS

This invention pertains to recombinant genes coding for oleosin proteins in cacao and to the 
polypeptides encoded by said genes. In particular, the present invention relates to the use of 
such genes and gene products for the manufacture of emulsions and flavor. 

In a variety of different plants, such as soybean, rapeseed or sunflower, oily components that
are insoluble in water are stored in subcellular structures termed "oil bodies". The oil stored
in these particles forms cellular food reserves that may be mobilized quickly when large
increases in cellular metabolism are required, such as during seed germination or pollen tube
growth. Most plant seeds contain stored triacylglycerols (TAGs) as food reserves for
germination and post-germination growth, although the level of TAGs stored in seeds varies
between different plants.

Intracellular oil bodies of seeds are generally between 0.5 and 2 µm in diameter (Tzen et al.,
Plant Physiol. 101 (1993), 267-276) and are considered to be composed of a matrix of TAGs
surrounded by a phospholipid layer and associated with a set of different proteins that are
called oil body proteins or oleosins. The function of said oleosins is deemed to reside in
maintaining the oil reserves of seeds and pollen as small stable droplets with a high
surface-to-volume ratio, which facilitates the rapid conversion of the TAGs into free fatty
acids via lipase-mediated hydrolysis at the oil body surface.

Genomic clones encoding oleosins have been isolated for two species, namely maize (Browman
et al., J. Biol. Chem. 265 (1987), 11275-11279) and carrot (Hatzopoulos et al., Plant Cell 2
(1990), 457-467). Moreover, cDNA clones have been obtained from the cultivated oilseed Brassica
napus, and the genomic organisation of the corresponding gene has been determined (Murphy et
al., Biochem. Biophys. Acta 1088 (1991), 86-94). However, not every plant is presumed to make
use of such oleosins, and for cacao it was generally held that no such genes or proteins are
present (Leprince et al., Planta 204 (1998), 109-119).

Most of the plant seed oil bodies and/or oleosins analyzed to date have been derived from
seeds that undergo drying during maturation and can be stored safely for long periods under
dry, low temperature conditions ("orthodox" seeds). To this end, genomic clones encoding
oleosins have been isolated for two species, namely maize (Browman et al., J. Biol. Chem.
265 (1987), 11275-11279) and carrot (Hatzopoulos et al., Plant Cell 2 (1990), 457-467).
Moreover, Murphy et al. report in Biochem. Biophys. Acta 1088 (1991), 86-94 the isolation
of a cDNA clone and the genomic organisation of oleosin in the cultivated oilseed Brassica
napus.

In addition, studies have been carried out on oleosin proteins of two other groups of seeds:
seeds that do not undergo desiccation during late maturation and are usually killed at a high
water content and low temperatures ("recalcitrant" seeds; e.g. cacao and red oak), and seeds
that do undergo desiccation but are sensitive to storage at temperatures below 0°C
("intermediate" seeds; e.g. coffee and neem) (Leprince et al. (1998) Planta 204, 109-119). The
data presented in this report led to the conclusion that the seeds of red oak had very low
levels of oleosin proteins, while cacao did not seem to have any oleosin proteins at all. In
contrast, "intermediate" seeds were shown to have both oil bodies, as observed by electron
microscopy, and levels of oleosins similar to those observed in "orthodox" seeds, such as
rapeseed (Brassica napus).

The known oleosins turned out to be small alkaline proteins having a molecular weight of about
15 to 26 kDa and exhibiting an unusually long central hydrophobic region (more than about 70
amino acids). In an intact oil body within the cell this hydrophobic region is deemed to reside
within the TAG matrix and to anchor the oleosin in the oily central matrix. The N-terminal
region of known oleosin proteins has been found to be rather diverse in both sequence and
length.

Cacao is an important raw material for manufacturers of confectionery and other products, e.g.
chocolate. It is known that during fermentation of cacao the existing protease activity in the
cacao seed results in the formation of an increased level of cacao flavor precursors, such as
hydrophilic peptides and hydrophobic amino acids, which contribute significantly to the
typical flavor the consumer knows as the cacao flavor (Mohr, W. W., Landschreiber, E., and
Severin, T. (1976) Fett. Wissenschaft. Technologie 78, 88-95; Voigt, J., Biehl, B.,
Heinrichs, H., Kamaruddin, S., Gaim Marsoner, G., and Hugi, A. (1994) Food Chemistry 49,
173-180). This increase in flavor precursor peptides and hydrophobic amino acids depends on
the proteolytic activity within the seed during the fermentation process and on the amount of
proteins containing these precursor peptides and hydrophobic amino acids. The progress of the
fermentation reaction has to be carefully monitored so that the desired cacao flavor
precursors are eventually obtained. Also, the raw materials have to be evaluated for flavor
potential, since cacao seeds deficient in proteins containing flavor precursor peptides and
hydrophobic amino acids will result in a fermented material deficient in cacao flavor
precursors. Consequently, there is a need in the art to provide cacao raw material that
consistently has a sufficient amount of cacao flavor precursor peptides and hydrophobic amino
acids.

The problem of the present invention is therefore to provide means to enhance the flavor
potential of cacao.

This problem has been solved by providing a recombinant oleosin gene of cacao as identified 
by SEQ ID No 1 and SEQ ID No 2 or variants thereof coding for functional cacao oleosin 
polypeptides. 

The DNA sequence may be incorporated in a vector, preferably in a plasmid, and brought into 
a plant of interest to e.g. overexpress the cacao oleosin gene. 

According to another preferred embodiment the polypeptide encoded by the recombinant
oleosin gene according to this invention exhibits an amino acid sequence as identified by SEQ
ID No 3 or SEQ ID No 4.

According to yet another preferred embodiment the cacao oleosin polypeptide may be used for
preparing emulsifiers or may be used in the preparation of flavor.

According to yet another preferred embodiment the present invention provides a food product
containing an oleosin polypeptide and preferably enzymatically degraded products thereof.

According to another embodiment cacao oleosin proteins or derived peptides may be used to
encapsulate oil soluble molecules, for example certain drugs, vitamins and various nutritional
supplements.

In the figures, 

Fig. 1 shows an SDS-PAGE gel of the oil body purification, illustrating the protein profile of
the different fractions produced during the purification procedure. Aliquots were taken at
different stages of the purification of cacao seed oil bodies, run on a 10-20% SDS-PAGE minigel
and then silver stained. Lane 1, "floating" oil body material recovered from the first
centrifugation step; Lane 2, "floating" oil body material recovered from the first grinding
buffer wash; Lane 3, "floating" oil body material recovered from the partial urea wash; Lanes
4-6, "floating" oil body material recovered from urea washes #1-3; Lane 7, proteins recovered
from acetone extracted oil bodies; Lane 8, same as Lane 7, but threefold more protein; M,
Promega mid-range molecular weight protein markers.

Fig. 2 shows a Kyte-Doolittle hydrophobicity plot of the 16.9 kDa cacao oleosin protein;

Fig. 3 shows a Kyte-Doolittle hydrophobicity plot of the 15.8 kDa cacao oleosin protein;

Fig. 4 shows a sequence comparison of the protein sequences SEQ ID No 3 and SEQ ID No 4;
black regions mark sequences conserved in the two protein sequences.

During the extensive studies leading to the present invention the present inventors have found
that, contrary to the general belief (Leprince et al., supra), cacao does contain at least two
different low molecular weight oleosin proteins. The oleosin proteins are synthesized in cacao
seed and have calculated molecular weights of about 16.9 kDa and 15.8 kDa, respectively.

Upon identifying the DNA sequences and the putative protein sequences, the cacao oleosin
proteins could be shown to contain regions of both high hydrophobicity and high
hydrophilicity, as derived from Kyte-Doolittle plots (see Fig. 2 and Fig. 3). During a
degradative process, such as that prevailing during the fermentation of cacao, said oleosin
proteins will give rise to a number of different peptides and hydrophobic amino acids, which
will contribute to an enhanced cacao flavor during the manufacture of cocoa products. The
cacao oleosin genes may therefore be expressed or overexpressed in a suitable system in order
to provide cacao flavor precursors.
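
The hydrophobicity analysis behind Fig. 2 and Fig. 3 can be illustrated with a minimal Python
sketch. It translates an 81-nucleotide stretch copied from SEQ ID No 1 (nucleotides 221 to
301, read in the frame of the predicted protein) and computes sliding-window Kyte-Doolittle
averages. The function names, the window size of 9 and the choice of fragment are assumptions
made for illustration; the software and settings actually used for the figures are not stated.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Kyte-Doolittle hydropathy index per amino acid.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def translate(dna):
    """Translate a DNA string read in frame from its first base."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def hydropathy_profile(protein, window=9):
    """Mean Kyte-Doolittle score for every full window along the protein."""
    scores = [KD[aa] for aa in protein]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


# Illustrative fragment: SEQ ID No 1, nucleotides 221-301 (hydrophobic core).
FRAGMENT = ("ctggctgtgcttaccctcctcccagtcggtggcattctgcttgcgttagc"
            "agggctgaccctcactggcaccgtcattggg")

if __name__ == "__main__":
    peptide = translate(FRAGMENT)
    print(peptide)
    print([round(x, 2) for x in hydropathy_profile(peptide)])
```

Positive window averages indicate hydrophobic stretches, which is the pattern the description
attributes to the central domain of the cacao oleosins.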

The oleosin genes may be expressed in a variety of different ways known to the skilled person.
Thus, an expression cassette may be prepared harboring one or more copies of an oleosin gene
according to the present invention and containing a promoter suitable for expressing the gene
in a given system. The promoter will be selected according to the requirements of the system
in which the oleosin gene is to be expressed.

Such systems include bacterial cells, such as E. coli, or yeast or insect cells. For each of
the various expression systems appropriate vectors are known to the skilled person. The 
oleosin proteins produced in such systems may then be isolated from the cells or, in case the 
protein is secreted into the medium, from the culture medium itself. 

According to a preferred embodiment the oleosin gene is expressed in a plant cell, more
preferably in cacao itself. To this end one or more copies of an oleosin gene of the present
invention may be introduced into the respective plant cells, where each copy may be under the
control of its endogenous promoter or of an exogenous promoter. Accordingly, an increased
expression of oleosin in the plant cell will be possible.

In case the oleosin gene is expressed in cacao itself, said cacao may be directly used for the
preparation of cacao flavor precursors, with the result that less raw material will be required
for obtaining the same level of flavor precursors as compared to conventional cacao raw
material.

As methods for introducing constructs containing the oleosin gene operably linked to an
appropriate promoter into plant cells, there may be mentioned electroporation of protoplasts,
bombardment with DNA coated particles, or the use of known bacterial vectors for plant
transformation, such as the vectors used with the bacterium Agrobacterium tumefaciens.

After the plant cells are transformed, they may be regenerated into plants according to
conventional methods, such as described in McCormick et al., Plant Cell Rep. 5 (1986), 81-84.
Several generations may be grown and either pollinated with the same transformed strain or a
different strain, while ensuring that the desired phenotypic trait is maintained. According to
a preferred embodiment cacao trees may eventually be obtained that exhibit a high content of
oleosin proteins in their seeds, which may serve as a precursor pool for flavor.

The oleosin proteins obtained as detailed above may also be used as an emulsifier or, making
use of their inherent property of stabilizing small oil droplets in a cacao cell, as an
encapsulating agent for oil soluble molecules. As examples of the use of the cacao oleosin
proteins there may be mentioned their use in the food industry for preparing standard food
emulsions, such as cheese, yogurt, margarine, mayonnaise, vinaigrette, ice cream, salad
dressing, baking products etc., or their use in the cosmetic industry for producing e.g. soaps,
skin creams, facial creams, tooth pastes, lipstick, make-up etc.

The present invention will now be described by means of examples, which are not to be
construed as limiting it.

Example 1:
Isolation of cacao seed oil bodies

For the isolation of cacao seed oil bodies, the cacao seeds used were from ripe pods of cacao
variety EET 95 grown in the greenhouse under open pollination conditions.

The procedure used was a modified version of an oil body isolation procedure developed by
Millichip et al. ((1996) Biochem. J. 314, 333-337). Eight mature and ungerminated EET 95 seeds
were taken and their testa and radicle were removed. Each seed was then chopped into small
pieces with a sharp blade at room temperature and the material was immediately put into two
Falcon tubes on ice, each containing 30 ml of grind buffer (0.1 M potassium phosphate buffer,
25 mM β-mercaptoethanol, 10 mM ascorbic acid, 0.3 M sucrose; final pH adjusted to 7.2 with
KOH). The chopped seeds were then homogenized for 45 seconds on ice with an Ultra-Turrax T-25
fitted with the larger N-18G head (Janke & Kunkel GmbH & Co KG). The homogenized material was
quickly filtered through a 500 µm mesh screen, keeping the filtrate on ice as much as
possible. The material remaining on the screen was washed twice with 20 ml of grind buffer
(supra). The filtrate was subsequently put in four clear polycarbonate Corex tubes (30 ml) and
centrifuged at 16,000 rpm (20,000 g) at 10°C for 20 minutes. After centrifugation, the top
"floating" material (oil bodies) was transferred from the four tubes with a spatula to new
Corex tubes containing fresh grinding buffer. The remaining "floating" material, which becomes
suspended at the top of the supernatant during handling, was collected as well using a pipette
and was transferred into fresh grinding buffer. For this first grind buffer wash, the volume
was reduced to approximately 40-50 ml and put into two Corex tubes. The "floating" material was
resuspended by homogenization in the Corex tubes on ice for 45 seconds with the Ultra-Turrax
(smaller head N-10G) and respun (20 minutes, 16,000 rpm, 10°C). The top layer was again
collected and transferred to a new tube as described above, which contained urea wash buffer
(50 mM Tris-HCl, 9 M urea, 10 mM β-mercaptoethanol, final pH 7.2). This partial urea wash mix
was again resuspended by homogenization and centrifuged as in the previous wash step.
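
The centrifugation settings above quote both a rotor speed (16,000 rpm) and a relative
centrifugal force (20,000 g). The two are linked by the standard relation
RCF = 1.118e-5 x r x rpm^2, with r the rotor radius in cm. The Python sketch below applies it.
The function names are illustrative, and the rotor radius it derives is only what these two
figures imply, since the rotor used is not named in the text.

```python
RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (multiples of g) at a given radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Rotor speed needed to reach a target RCF at a given radius."""
    return (rcf / (RCF_CONSTANT * radius_cm)) ** 0.5


def implied_radius_cm(rpm, rcf):
    """Rotor radius implied by a quoted rpm / RCF pair."""
    return rcf / (RCF_CONSTANT * rpm ** 2)


if __name__ == "__main__":
    radius = implied_radius_cm(16_000, 20_000)
    print(f"implied radius: {radius:.2f} cm")
    # Speed to use on a different, hypothetical 10 cm rotor for the same force
    print(f"rpm at 10 cm: {rpm_from_rcf(20_000, 10.0):.0f}")
```

Reporting the force rather than the speed makes the step reproducible on a rotor of a
different radius.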

The "floating" material formed after centrifugation in the urea wash buffer was again
transferred to a new Corex tube with urea wash buffer at room temperature and this mix was
homogenized as described above. The homogenized material was agitated at high speed at room
temperature for 5 minutes, and then centrifuged for 20 minutes at 16,000 rpm at 20°C. After
this centrifugation step, the relatively clear wash solution was completely removed from the
Corex tube by pipetting, with minimal loss of floating material, and fresh urea wash solution
was added to the same Corex tubes. This method maximized the recovery of the oil bodies, as the
material binds to the tube walls and is lost if a new tube is used for each washing step. The
floating material was rehomogenized, agitated for 15 minutes at room temperature and then
centrifuged for 20 minutes at 16,000 rpm at 20°C (first 100% urea wash). This last wash step in
100% urea wash buffer was repeated three times.

Following the last washing step, the floating material that remained in the two Corex tubes
after removing the urea wash buffer was recovered in 10 ml of urea wash buffer plus 0.025%
Triton X-100 and aliquoted into six 2 ml microcentrifuge tubes. These tubes were spun at 10,000
rpm for ten minutes and the solution below the "floating" oil bodies was removed. To remove
the fat from these oil body preparations, 1 ml of acetone was added to the oil bodies in each
tube. This mixture was vortexed vigorously and then sonicated for 2-4 minutes at room
temperature. The tubes were then spun at room temperature for 5 minutes at 10,000 rpm. The
supernatants were removed and the acetone extraction procedure was repeated 4 times. Finally,
the pellets recovered were dried under vacuum in a Speed-Vac (Savant).



Example 2:

Isolation and Analysis of a Cacao Oil Body Protein by SDS-PAGE and Peptide Sequence Analysis



60 µl of SDS-PAGE gel loading buffer (62.5 mM Tris-HCl pH 6.8, 12.5% glycerol, 2% SDS,
715 mM β-mercaptoethanol, 0.025% bromophenol blue) were added to two microcentrifuge tubes
containing acetone extracted oil bodies prepared as described in example 1. This material was
heated to 50°C, sonicated twice for 5 minutes, vortexed, and then centrifuged. The
supernatants were combined and then run in three wells of a freshly prepared 20 cm 15%
SDS-PAGE gel prepared with Duracryl (ESA, Chelmsford, MA, U.S.A.). After migration, the gel
was fixed twice for 20 minutes in 50% methanol, 10% acetic acid in water. The gel was stained
overnight with a solution of 45% methanol, 10% acetic acid in water containing 3 mg amido
black per 100 ml. Then, the gel was rinsed with Milli Q purified water several times.

Fig. 1 shows a picture of the stained SDS-PAGE gel, from which it is evident that several
proteins were enriched during the oil body isolation procedure. The protein profile
demonstrates that there has been a substantial enrichment of two bands around the size
expected for oleosins: a major band with an apparent molecular weight of 16.1 kDa and a less
intense band with an apparent molecular weight of 15.0 kDa. A larger amount of the same sample
was then run on a long preparative SDS-PAGE gel. The major band at around 16.1 kDa was cut out
and subjected to trypsin hydrolysis by incubating the gel slice in 200 µl of 0.05 M Tris-HCl
pH 8.6, 0.01% Tween 20 and 0.2 µg sequencing grade trypsin for 18 hours at 30°C. The peptides
thus obtained were separated on an in-line combination DEAE-C18 HPLC column (DEAE Aquapore,
7 µm, 2.1 x 30 mm, from Perkin Elmer; C-18 column, catalogue #218TP52, 2.1 x 250 mm, from
VYDAC) using a gradient of 2%-45% acetonitrile in 0.1% TFA. One large peptide peak was chosen
for N-terminal sequence analysis.

Peptide sequencing was performed on a 494 ABI sequencer using Edman degradation chemistry
according to the manufacturer's instructions, and yielded the following sequence:

MetGlnAspMetAlaGlyTyrValGlyGlnLys

Surprisingly, the sequence obtained showed high homology to a sequence found in the 16.4 kDa
oleosin of Gossypium hirsutum (cotton; Hughes, D.W., Wang, H.Y. and Galau, G.A. (1993) Plant
Physiol. 101, 697-698), supporting the assumption that the cacao oil body protein investigated
was indeed an oleosin protein.
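
The trypsin digestion step can be mimicked in silico. The Python sketch below cleaves a protein
after every Lys or Arg, except when the next residue is Pro (the usual rule of thumb), and
applies this to a 22-residue stretch translated here from nucleotides 491 to 556 of SEQ ID
No 1. The function name and the choice of fragment are illustrative assumptions, not part of
the described procedure. One of the predicted fragments, MQDMAGYVGQK, spans the region covered
by the sequenced peptide (the cDNA codon at its fifth position, GCA, encodes Ala).

```python
def trypsin_digest(protein):
    """Split a protein after K or R, unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        is_last = i == len(protein) - 1
        if residue in "KR" and not is_last and protein[i + 1] != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal remainder
    return peptides


# Translation of SEQ ID No 1, nucleotides 491-556 (illustrative fragment)
FRAGMENT = "AKRRMQDMAGYVGQKTKEVGQK"

if __name__ == "__main__":
    for peptide in trypsin_digest(FRAGMENT):
        print(peptide)
```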

Example 3: 

Preparation of mRNA from Cacao Seeds

The mRNA used for the cDNA library construction was isolated from seeds of an immature, mostly
green pod of EET 95 grown in the greenhouse under open pollination conditions. The matrix
tissue encasing these seeds was solid and the seeds displayed two very distinct developmental
stages. One type of seed appeared relatively mature, i.e. the seeds were purple with only
small amounts of white gelatinous matrix tissue in the seed folds; the other seeds were
significantly less mature, having both white and pink sections and significant amounts of the
gelatinous matrix material in the seed folds. For RNA isolation, small pieces of 3 of the more
mature seeds and small pieces of 2 of the less mature seeds were taken as the seeds were freed
from the matrix and immediately frozen in liquid nitrogen. This material was then ground to a
powder with a mortar and pestle in the presence of liquid nitrogen. The liquid nitrogen and
cacao powder were put in a 50 ml Falcon tube and the liquid nitrogen was allowed to evaporate.
As the powder warmed towards 0°C, 28 ml of solution A were added (14 ml 100 mM Tris-HCl pH 8 +
14 ml Aqua phenol (Appligene/Oncor) + 0.1% hydroxyquinoline + 140 µl 10% SDS + 110 µl
β-mercaptoethanol). This mixture was homogenized with a glass Dounce homogenizer on ice. The
resulting solution was spun for 10 minutes at 8,000 rpm. The aqueous phase was recovered and
manually mixed with 7 ml phenol + 7 ml chloroform/isoamyl alcohol (Ready Red,
Appligene/Oncor). The extraction was then spun at 8,000 rpm for 10 minutes. After this stage
great care was taken to avoid any contamination of the sample with RNase. The aqueous phase
recovered was re-extracted twice with 14 ml chloroform/isoamyl alcohol. The final aqueous
phase obtained was adjusted to 0.3 M Na acetate and 2 volumes of ethanol were added.
Subsequently, the tube's content was mixed, put at -20°C for 1 hour and at -80°C for 15
minutes, and then spun for 30 minutes at 8,000 rpm.

The nucleic acid pellet recovered was slowly resuspended in 10 ml of 100 mM Tris-HCl pH 8.
Then, 3 ml of 8 M lithium chloride were added (2 M final), followed by 2 volumes of ethanol.
This mixture was put at -20°C for 1 hour, followed by 15 minutes at -80°C. The nucleic acid
precipitate formed was recovered by centrifugation at 8,000 rpm for 30 minutes. This pellet
was resuspended in 600 µl RNase-free H2O and divided into 200 µl aliquots, which were frozen
at -80°C. The purity of the isolated RNA was verified by spectral analysis between 220 nm and
300 nm, and its integrity was confirmed by checking that the ribosomal RNA bands were intact
after running a sample on an RNA gel under the appropriate conditions.
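
The reagent additions in this example follow the usual dilution relation
C_final = C_stock x V_stock / (V_stock + V_sample). A minimal Python sketch, with illustrative
function names, checks the lithium chloride step: 3 ml of 8 M stock added to 10 ml of
resuspended pellet gives about 1.85 M, which the text rounds to 2 M. The 3 M sodium acetate
stock in the second call is an assumption; the text only gives the 0.3 M target.

```python
def final_concentration(stock_conc, stock_vol, sample_vol):
    """Concentration after adding stock_vol of stock to sample_vol of sample."""
    return stock_conc * stock_vol / (stock_vol + sample_vol)


def stock_volume_needed(stock_conc, target_conc, sample_vol):
    """Volume of stock to add to sample_vol to reach target_conc."""
    if target_conc >= stock_conc:
        raise ValueError("target must be below the stock concentration")
    return target_conc * sample_vol / (stock_conc - target_conc)


if __name__ == "__main__":
    # LiCl step: 3 ml of 8 M stock into 10 ml of resuspended nucleic acid
    print(f"{final_concentration(8.0, 3.0, 10.0):.2f} M LiCl")
    # Na acetate to 0.3 M in 10 ml, assuming a 3 M stock (not given in the text)
    print(f"{stock_volume_needed(3.0, 0.3, 10.0):.2f} ml of 3 M Na acetate")
```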

Example 4:

Preparation of a cDNA Library from Cacao Seed mRNA 

Poly A+ RNA was prepared using an Oligotex kit (Qiagen) from total cacao seed RNA prepared as
described in example 3. The procedure employed was as described in the instruction leaflet
for 250-500 ng total RNA. In the final step, the mRNA was eluted with 25 µl preheated elution
buffer, and the column was then washed with 80 µl preheated elution buffer. The eluted
material was pooled and adjusted to 0.3 M Na acetate, and the RNA was precipitated by adding
two volumes of ethanol and holding at -20°C for one hour and at -80°C for 20 minutes. The RNA
was pelleted by spinning for 15 minutes at 13,000 rpm and the pellet was washed with 70%
ethanol and dried under vacuum in a Speed-Vac. The final pellet was resuspended in 10 µl of
RNase-free water, and the concentration of RNA present was found to be 5-10 ng/µl using
Nucleic Acid "QuickSticks" (Clontech).

The synthesis of cDNA from the poly A+ mRNA was carried out using a SMART PCR cDNA synthesis
kit (Clontech). The method used was as described in the kit instructions. For the first strand
cDNA synthesis step, 4 µl (20-40 ng) of poly A+ mRNA were used and, as advised in the SMART
protocol, 200 units of Gibco BRL Superscript II MMLV reverse transcriptase were used. The PCR
step of the SMART protocol was also set up as directed in the kit instructions, except that
only 2 µl of the first strand reaction were added. First, 18 cycles of PCR were run; then
35 µl were taken out of the total reaction (100 µl) and subjected to a further 5 cycles of
PCR.

A pool of the two PCR reactions was then prepared from 40 µl of the 18 cycle PCR reaction and
15 µl of the 23 cycle PCR reaction. 2.5 µl proteinase K (Boehringer Mannheim, nuclease free,
14 µg/µl) were added to this cDNA/PCR mixture and the reaction was incubated at 45°C for one
hour. After a brief spin, the reaction was stopped by heating the mixture to 90°C for 8
minutes. The mix was then chilled on ice, 5 µl of T4 DNA polymerase (3 units/µl) were added
and the reaction was incubated at 14-16°C for 30 minutes. Then, 25 µl of Milli Q purified
water, 25 µl phenol ("Aqua phenol") and 25 µl chloroform/isoamyl alcohol ("Ready Red") were
added. This mixture was vortexed and spun, and the top aqueous layer was taken.

The phenol layer was re-extracted with 50 µl of H2O. The two resulting aqueous layers were
then pooled and re-extracted with chloroform/isoamyl alcohol ("Ready Red"). The DNA in the
aqueous layer recovered was precipitated by adding ethanol and chilling as described above.
The dried DNA obtained was resuspended in TE buffer (10 mM Tris-HCl pH 8, 1 mM EDTA) and its
concentration was estimated to be approximately 75 ng/µl using the "QuickSticks" from
Clontech.

The cDNA was then prepared for blunt end ligation into the PCR-Script Amp SK(+) cloning
vector of Stratagene. The method used was as described in the PCR-Script Amp cloning kit
(Stratagene). First, the "polishing" reaction was carried out as described in the Stratagene
PCR-Script Amp cloning kit using the cloned thermostable Pfu DNA polymerase included in the
kit. This was done to ensure that the cDNA was blunt ended before the ligation reaction. The
DNA thus treated was subsequently purified using the Strataprep PCR purification kit
(Stratagene). The DNA was eluted from the column with 50 µl of Milli Q purified water and
lyophilized to dryness, and 6 µl of water were added. One µl of this DNA solution was used to
assess the final recovery of the cDNA. Then the following ligation components of the
PCR-Script Amp kit were added to the remaining 5 µl of purified cDNA in the tube in which the
DNA had been dried: 2 µl of pPCR-Script Amp SK(+) cloning vector (20 ng), 1 µl PCR-Script 10X
reaction buffer, 0.5 µl 10 mM rATP, 1 µl SrfI restriction enzyme (5 U/µl), and 1 µl T4 DNA
ligase (4 U/µl). This mixture was incubated at room temperature for 1 hour, then heated to
65°C for 10 minutes. Two µl of the ligated DNA were transformed into XL2-Blue ultracompetent
cells (Stratagene) as described in the instruction manual for these cells.

Example 5: 
cDNA Library Screening

The peptide sequence obtained from the gel purified oil body protein (see example 2) was used
to synthesize one set of degenerate primers that correspond to two overlapping regions of the
cacao oleosin peptide sequence. The primers have the following sequences, where I is
deoxyinosine:

(1) 5' ITA-ICC-IGC-CAT-ITC-ITG-CAT 3' 

(2) 5' ITT-ITG-ICC-IAC-ITA-ICC-IGC-CAT 3' 
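
To see which residues each primer covers, the primers can be reverse-complemented and read
codon by codon, treating inosine (I) as a wildcard. The Python sketch below does this; the
function names are illustrative. Where the wildcard falls in a two-fold degenerate codon, both
possibilities are reported (for example "HQ" for CAI), so the output is a set of candidates
per codon rather than a unique peptide.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C", "I": "I"}


def reverse_complement(primer):
    """Reverse complement; inosine is kept as the wildcard 'I'."""
    return "".join(COMPLEMENT[b] for b in reversed(primer))


def decode_codon(codon):
    """All amino acids a codon can encode when each I is any of A, C, G, T."""
    options = [BASES if b == "I" else b for b in codon]
    return "".join(sorted({CODON_TABLE["".join(c)] for c in product(*options)}))


def decode_primer(primer):
    """Amino acid candidates per codon of the sense strand matching the primer."""
    sense = reverse_complement(primer.replace("-", "").upper())
    return [decode_codon(sense[i:i + 3]) for i in range(0, len(sense), 3)]


PRIMER_1 = "ITA-ICC-IGC-CAT-ITC-ITG-CAT"
PRIMER_2 = "ITT-ITG-ICC-IAC-ITA-ICC-IGC-CAT"

if __name__ == "__main__":
    print(decode_primer(PRIMER_1))
    print(decode_primer(PRIMER_2))
```

Both primers are antisense, which is why they are reverse-complemented before decoding; a
"*" in the output marks a stop codon among the wildcard possibilities.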

Another set of degenerate primers was designed from two different regions of the 16.4 kDa
cotton oleosin protein sequence. The two peptide sequences chosen were located N-terminal to
the region of the 16.4 kDa cotton oleosin that has high homology to the cacao oleosin peptide
sequence described here.

These two sets of primers were synthesized for the screening step. Different pairs of these
degenerate primers were then tested with PCR amplified cDNA derived from immature seeds of
T. cacao variety Larringa. One primer set was found to specifically amplify a fragment of
approximately 300-400 bp from the cacao seed cDNA.



An initial screen of the cDNA library using degenerate primers indicated that the cacao oleosin
cDNA clone was highly represented in this library. Therefore, plasmid preparations from 19
isolated transformants were screened directly with the degenerate primer set in Fig. 2, and
three positive clones were selected for further study. Two clones, lcdtc-25 and lcdtc-47, had
inserts of approximately 0.850-0.950 kb and one clone, lcdtc-42, had an insert of
approximately 1-1.1 kb. Clone lcdtc-42 was chosen for further analysis by DNA sequencing.
Double strand DNA sequencing showed that clone lcdtc-42 contained a full length insert of 934
base pairs (SEQ ID No 1). The open reading frame of this insert encodes a protein with a
predicted molecular weight of 16,885 daltons (SEQ ID No 3).

Analysis of the 16.9 kDa cacao oleosin cDNA open reading frame showed that this protein is
similar to other oleosins in having a very long central hydrophobic domain flanked by
hydrophilic N-terminal and C-terminal regions (Fig. 2), and that it is a very basic protein
with an isoelectric point of 9.734.



Sequencing of 13 other randomly chosen cDNA clones from this cDNA library also led to the
independent discovery of the 16.9 kDa cacao oleosin cDNA. Furthermore, during this random
sequencing experiment another cacao oleosin sequence of 775 base pairs was found (SEQ ID
No 2). This cDNA has an open reading frame that encodes a protein with a calculated molecular
weight of 15.8 kDa (SEQ ID No 4), which corresponds to the oil body protein with an apparent
molecular weight of 15.0 kDa seen in Fig. 1.

The 15.8 kDa cacao oleosin protein sequence also has a very long central hydrophobic domain
flanked by hydrophilic N-terminal and C-terminal regions (Fig. 3), and is a very basic
protein with an isoelectric point of 9.34. Comparative sequence analysis (Fig. 4) shows that
the 15.8 kDa cacao oleosin protein sequence is quite distinct from the 16.9 kDa cacao oleosin
protein sequence, with only 43.4% sequence identity between the two proteins.
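
A percent identity figure such as the one above is computed from a pairwise alignment. The
Python sketch below shows one common convention, identical positions divided by aligned
columns, on two already aligned strings with '-' as the gap character. The alignment program
and the denominator actually used for the 43.4% value are not stated, so both the convention
and the toy input are assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Identical residues as a percentage of aligned (non double-gap) columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Toy alignment for illustration only, not taken from Fig. 4
    print(f"{percent_identity('MADR-DRPH', 'MSDQNDKPM'):.1f}%")
```

Other conventions divide by the length of the shorter sequence or ignore gapped columns, which
can shift the reported value by several points.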

Claims

1. A recombinant DNA sequence as identified by SEQ ID No. 1 or functional variants
thereof, coding for functional oleosin polypeptides of cocoa.

2. The recombinant DNA sequence according to claim 1 which is identified by SEQ ID 
No. 2 

3. A vector comprising a DNA sequence according to any of the claims 1 or 2.

4. The vector according to claim 3, which is a plasmid. 

5. A polypeptide encoded by a DNA sequence according to claim 1 or claim 2. 

6. A polypeptide according to claim 5, which is identified by SEQ ID No 3 or SEQ ID No 
4. 

7. A cell, harboring a recombinant DNA sequence according to any of the claims 1 or 2. 

8. The cell according to claim 7, which is a plant cell. 

9. The cell according to claim 8, which is a cacao cell. 

10. A plant part, harboring a recombinant DNA sequence according to any of the claims 1
or 2, which is a seed.

11. A plant, harboring a recombinant DNA sequence according to claim 1 or claim 2.

12. The plant according to claim 11, which is a cacao tree.

13. Use of a recombinant DNA sequence according to any of the claims 1 or 2 for the 
production of an oleosin polypeptide. 

14. Use of an oleosin polypeptide according to claim 5 for the manufacture of emulsifiers. 

15. Use of an oleosin polypeptide according to claim 5 or claim 6 for the manufacture of 
flavor, preferably cacao flavor. 

16. Use of an oleosin polypeptide according to claim 5 as an encapsulating agent for oil
soluble drugs or oil soluble nutritional supplements.

17. Food product containing an oleosin polypeptide according to claim 5 or claim 6 or 
enzymatically degraded products thereof. 

18. Cosmetic product containing an oleosin polypeptide according to claim 5 or claim 6.

Fig. 1

SEQUENCE LISTING

<110> Société des Produits Nestlé

<120> Oleosins in cacao 

<130> Oleosins in cacao 

<140> 
<141> 

<160> 4 

<170> PatentIn Ver. 2.1

<210> 1 
<211> 934 
<212> DNA 
<213> cacao 

<400> 1 

aagcagtggt aacaacgcag agtacgcggg gacctctctt tctctctcac ttttgctgtc 60 
attaacataa tcatttctgc atttgtgaaa gctcataatt taatctctac caatggctga 120 
ccgtgaccgc cctcaccaga ttcaggttca ccaacatcac cgctttgacc agggtggtaa 180 
gaactatcaa tccgctagtg gaccatcagc gacccaggtt ctggctgtgc ttaccctcct 240 
cccagtcggt ggcattctgc ttgcgttagc agggctgacc ctcactggca ccgtcattgg 300 
gctctgtgtg gccacaccac tgttcatcat cttcagcccg gttcttgtcc cagcagccat 360 
tgccgttggc ttggcagtgg ctggtttctt gtcctccggg gctttcgggt tgacggggct 420 
gtcctcactc gcctatgtct ttaatcgcct gaggagggcc accggtacgg agcagctgga 480 
catggaccag gctaagaggc gcatgcagga catggcaggg tatgtaggac agaagactaa 540 
ggaggttgga cagaagatcg agggtaaggc taatgagggt accgtaagga catgaatttg 600 
ataggagggg tacctgcttg catggggagg gcaataaagt gtagtctttt tcattctcaa 660 
ggtgttgtct gtgcagttgt ttgtgtatgt ctggttagcc atactagttg agagatagtg 720 
ggcaatgtaa ttagactctc gtatttgctg tctgtttttg agtttaattt gttcaattcc 780 
atgtatgctt tttctttatc ttaagtcagt ctctctatct cctgtgaaaa agctagtgac 840 
ttccagttaa atctctcaac ccttcagctt tgaacctctt gaatatcaat cacatcatca 900 
aggttcaaaa aaaaaaaaaa aaaaaaaaaa aaaa 934 



<210> 2 
<211> 775 
<212> DNA 
<213> cacao 

<400> 2 

aagcagtggt aacaacgcag agtacgcggg ggcaaccgtc ttccattttc tcacttaaat 60
ttcattgcca tttttcactg aacactatca tcagaccgag ggccgttcat catgtcgaat 120
gatcaaaaca agccgatgac tcagaagctc tatgagtcag ctccatcttc gcgccaggcg 180
gccaagtttt tgactgcaac cacactgggt gcaacgctgc tattcttgtc tgggttaacc 240
ttgaccggga cagtgatggc cctgatcatg gccacgccac tcatggtcat tttcagccca 300
attctagtcc cggctggggt agtcattttc ctggtgatta ccgggttctt gttttccggt 360
gggtgtgggg tggcggcgat cacggcgtta tcgtggatat ataattacgt gcgagggaaa 420
catccaccag gagcggatca gctggattat gcaagaaata cgcttgcgag gacggctagg 480
gacatgacgg agaaggctaa ggagtatgga caatatgtgc agcacaaggc tcaggaggtt 540
gctcaaggat cttgaataag agtgtttagc ttagggcttg gattgggttg aggtctgttg 600
gttctgtaag gtggtggtgg tagtgttgtg tcttgcttgt tgttttccat catatttgca 660
tgcatacagt gtaggtcatg tgtttttggg cttagtaatt gtaacagttg ctttagtttg 720
attctctttg tggcttcgaa aatctcgttt ctccaaaaaa aaaaaaaaaa aaaaa 775



<210> 3 
<211> 160 
<212> PRT 
<213> cacao 

<400> 3 

Met Ala Asp Arg Asp Arg Pro His Gln Ile Gln Val His Gln His His
  1               5                  10                  15

Arg Phe Asp Gln Gly Gly Lys Asn Tyr Gln Ser Ala Ser Gly Pro Ser
             20                  25                  30

Ala Thr Gln Val Leu Ala Val Leu Thr Leu Leu Pro Val Gly Gly Ile
         35                  40                  45

Leu Leu Ala Leu Ala Gly Leu Thr Leu Thr Gly Thr Val Ile Gly Leu
     50                  55                  60

Cys Val Ala Thr Pro Leu Phe Ile Ile Phe Ser Pro Val Leu Val Pro
 65                  70                  75                  80

Ala Ala Ile Ala Val Gly Leu Ala Val Ala Gly Phe Leu Ser Ser Gly
                 85                  90                  95

Ala Phe Gly Leu Thr Gly Leu Ser Ser Leu Ala Tyr Val Phe Asn Arg
            100                 105                 110

Leu Arg Arg Ala Thr Gly Thr Glu Gln Leu Asp Met Asp Gln Ala Lys
        115                 120                 125

Arg Arg Met Gln Asp Met Ala Gly Tyr Val Gly Gln Lys Thr Lys Glu
    130                 135                 140

Val Gly Gln Lys Ile Glu Gly Lys Ala Asn Glu Gly Thr Val Arg Thr
145                 150                 155                 160
<210> 4
<211> 147 
<212> PRT 
<213> cacao 

<400> 4 

Met Ser Asn Asp Gln Asn Lys Pro Met Thr Gln Lys Leu Tyr Glu Ser
  1               5                  10                  15

Ala Pro Ser Ser Arg Gln Ala Ala Lys Phe Leu Thr Ala Thr Thr Leu
             20                  25                  30

Gly Ala Thr Leu Leu Phe Leu Ser Gly Leu Thr Leu Thr Gly Thr Val
         35                  40                  45

Met Ala Leu Ile Met Ala Thr Pro Leu Met Val Ile Phe Ser Pro Ile
     50                  55                  60

Leu Val Pro Ala Gly Val Val Ile Phe Leu Val Ile Thr Gly Phe Leu
 65                  70                  75                  80

Phe Ser Gly Gly Cys Gly Val Ala Ala Ile Thr Ala Leu Ser Trp Ile
                 85                  90                  95

Tyr Asn Tyr Val Arg Gly Lys His Pro Pro Gly Ala Asp Gln Leu Asp
            100                 105                 110

Tyr Ala Arg Asn Thr Leu Ala Arg Thr Ala Arg Asp Met Thr Glu Lys
        115                 120                 125

Ala Lys Glu Tyr Gly Gln Tyr Val Gln His Lys Ala Gln Glu Val Ala
    130                 135                 140

Gln Gly Ser
145
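
The correspondence between the nucleotide and polypeptide entries of this listing can be checked by translating the open reading frames with the standard genetic code. The following is only an illustrative sketch and forms no part of the original listing: the function name `translate` and the variable `orf_start` are hypothetical, and the excerpt is the first 48 nucleotides of the open reading frame of SEQ ID No. 1, beginning at the ATG at position 113.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string (frame 0) to one-letter amino acids, stopping at a stop codon."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends the polypeptide
            break
        protein.append(aa)
    return "".join(protein)


# Positions 113-160 of SEQ ID No. 1, copied from the listing (spaces are ignored).
orf_start = "atggctga ccgtgaccgc cctcaccaga ttcaggttca ccaacatcac"
print(translate(orf_start))  # MADRDRPHQIQVHQHH
```

In three-letter code the printed residues read Met Ala Asp Arg Asp Arg Pro His Gln Ile Gln Val His Gln His His, which is the first row of SEQ ID No. 3. The same check can be applied to SEQ ID No. 2 and SEQ ID No. 4, starting from the ATG at position 112 of SEQ ID No. 2.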